A Framework for Trace Clustering and Concept-drift Detection in Event Streams

Authors

  • Sylvio Barbon Junior
  • Gabriel M. Tavares
  • Paolo Ceravolo
  • Ernesto Damiani
Abstract

Concept-drift is a well-known problem that affects data streams in which the underlying relation between a recorded tuple x and a system response y changes over time [1]. Ignoring concept-drift can degrade the quality of predictive analytics and their capacity to represent the most recent concepts. Nevertheless, implementing a concept-drift adaptation strategy is not a trivial task, because there are different types of concept-drift and each calls for a different adaptation. In this work, we discuss the Concept-Drift in Event Stream Framework (CDESF), which addresses some of these challenges for Trace Clustering (TC) [2] in data streams. Instead of creating an additional level of complexity that isolates the final user from the deep behaviour of the system, our goal is to offer a simple instrument to supervise concept-drift and track the evolution of clusters over time.
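
To make the definition of concept-drift concrete, the following is a minimal Python sketch, not the CDESF method itself. It assumes a synthetic stream in which the rule linking x to y flips at a chosen position, a static model that keeps applying the old rule, and a sliding-window error rate as the drift signal. The names `make_stream` and `detect_drift`, the window size, and the threshold are all illustrative assumptions.

```python
import random
from collections import deque


def make_stream(n=1000, drift_at=500, seed=0):
    """Yield (x, y) pairs; the rule linking x to y flips at index `drift_at`."""
    rng = random.Random(seed)
    for t in range(n):
        x = rng.random()
        # Old concept: y is True when x > 0.5. New concept: the opposite.
        y = (x > 0.5) if t < drift_at else (x <= 0.5)
        yield x, y


def detect_drift(stream, window=50, threshold=0.5):
    """Return the first index where the windowed error rate exceeds
    `threshold`, or None if that never happens."""
    recent_errors = deque(maxlen=window)
    for t, (x, y) in enumerate(stream):
        predicted = x > 0.5  # static model that only knows the old concept
        recent_errors.append(predicted != y)
        window_full = len(recent_errors) == window
        if window_full and sum(recent_errors) / window > threshold:
            return t
    return None


if __name__ == "__main__":
    print("drift flagged at index:", detect_drift(make_stream()))
```

The detector flags drift only some time after the change, once more than half of the window contains errors. A larger window lowers the risk of false alarms at the cost of a longer detection delay.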

Similar resources

Concept drift detection in event logs using statistical information of variants

In recent years, business process management (BPM) has been highly regarded as a means of improving the efficiency and effectiveness of organizations. Extracting and analyzing information on business processes is an important part of this structure. But these processes are not stable over time and may change for a variety of reasons, such as the environment and human resources. These changes ...


Concept drift detection in business process logs using deep learning

Process mining provides a bridge between process modeling and analysis on the one hand and data mining on the other hand. Process mining aims at discovering, monitoring, and improving real processes by extracting knowledge from event logs. However, as most business processes change over time (e.g. the effects of new legislation, seasonal effects, etc.), traditional process mining techniques ...


Incremental entropy-based clustering on categorical data streams with concept drift

Clustering on categorical data streams is a relatively new field that has not received as much attention as static data and numerical data streams. One of the main difficulties in categorical data analysis is the lack of an appropriate way to define a similarity or dissimilarity measure on the data. In this paper, we propose three dissimilarity measures: a point-cluster dissimilarity measure (base...


Detecting Concept Drift in Data Stream Using Semi-Supervised Classification

A data stream is a sequence of data generated by various information sources at high speed and high volume. Classifying data streams faces three challenges: unlimited length, online processing, and concept drift. In related research, the challenge of unlimited stream length is commonly met by dividing the stream into fixed-size windows or by gradual forgetting. Concept drift refer...


Handling adversarial concept drift in streaming data

Classifiers operating in a dynamic, real-world environment are vulnerable to adversarial activity, which causes the data distribution to change over time. These changes are traditionally referred to as concept drift, and several approaches have been developed in the literature to deal with the problem of drift handling and detection. However, most concept drift handling techniques approach it as ...




Publication date: 2017